Local Rhyme-based Acoustic Features for Mandarin Tone Recognition

Author

  • Dinoj Surendran
Abstract

We investigate the use in Mandarin tone recognition of over two hundred possible local acoustic features based on pitch, overall intensity, and band-passed intensity in the rhyme of a syllable. Features involving pitch height are not as useful as one might expect, showing the need for phrase-level pitch height correction. The intensity contour is useful, particularly when rhyme-initial intensity is subtracted. Intensity in certain medium- and high-frequency bands also provides useful information. Unsurprisingly, contour tones are better recognized than level tones using only local features.

In tonal languages, lexical information is carried both by phonemes and by syllable-specific intonation patterns called tones. In the tonal language Mandarin, the five possible tones (high, rising, low, falling, neutral) carry as much information as vowels [1] [2]. Mandarin tone recognition is the problem of determining the tone of a syllable. Here, we assume that the syllable boundaries are known.

Acoustic features for Mandarin tone recognition can be derived from duration, pitch, overall intensity, and intensities in various high-frequency bands [3]. However, many such features are possible. Here we determine a useful subset of them for use in further experiments.

For now, we deliberately restrict ourselves to local features. Other than speaker normalization, we do not consider features that use information outside the syllable boundary. Furthermore, we limit ourselves to features computed on the rhyme of a syllable [4] to avoid the effect of syllable-initial consonants.

Pitch and overall intensity measurements were obtained using Praat [5]. Band energy measurements were obtained using multi-taper spectral analysis [6] on overlapping 20 ms chunks of speech taken every 5 ms.

1 Features Considered

The 221 local features we considered were the following.

• duration : Duration of the rhyme in milliseconds.
• # voiced : Number of voiced samples in the rhyme.
• int(F) : Mean, gradient, and intercept (all across the rhyme) of the energy contour in the band between F−250 Hz and F+250 Hz, for F = 250, 500, ..., 7500, 7750 Hz. There were 3 × 31 = 93 such features.
• We considered three acoustic measures, which we shall refer to as ‘cues’. Each was z-normalized by story before computing any features based on it.
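As an illustration only, the int(F) features and the per-story z-normalization described above might be computed along the following lines. This is a minimal Python sketch under stated assumptions, not the author's implementation. The function names (band_centers, z_normalize, contour_features) are hypothetical. The time axis is assumed to be whatever the caller passes (for example, the frame index), and the population standard deviation is assumed for z-normalization; the source specifies neither choice.

```python
from statistics import mean, pstdev


def band_centers(first=250, last=7750, step=250):
    """Centre frequencies F in Hz; each band spans F-250 Hz to F+250 Hz."""
    return list(range(first, last + 1, step))


def z_normalize(values):
    """Z-normalize one cue over all of its samples from a single story.

    Assumption: population standard deviation (pstdev) is used.
    """
    values = list(values)
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant cue carries no variation; map every sample to zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def contour_features(times, values):
    """Return (mean, gradient, intercept) for a contour across the rhyme.

    Gradient and intercept come from an ordinary least-squares line
    fitted to the (time, value) pairs.
    """
    times = list(times)
    values = list(values)
    if len(times) != len(values) or len(values) < 2:
        raise ValueError("need at least two (time, value) pairs of equal length")
    t_bar = mean(times)
    v_bar = mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    gradient = sxy / sxx
    intercept = v_bar - gradient * t_bar
    return v_bar, gradient, intercept
```

Under these assumptions, contour_features would be called once per band on that band's energy values within the rhyme, giving three numbers per band and 3 × 31 = 93 features over the bands returned by band_centers().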

Similar resources

Incorporating tone-related MLP posteriors in the feature representation for Mandarin ASR

Tone has a crucial role in Mandarin speech in distinguishing ambiguous words. In most state-of-the-art Mandarin automatic speech recognition systems, tonal acoustic units are used and F0 features are appended to the spectral features (MFCC/PLP). However, a tone depends on the F0 contour of a time span much longer than a frame. Ideally, systems would compute the frame-level likelihood of a tone u...

F0 Contour Analysis Based on Empirical Mode Decomposition for DNN Acoustic Modeling in Mandarin Speech Recognition

Tone information provides a strong distinction for many ambiguous characters in Mandarin Chinese. The use of tonal acoustic units and F0 related tonal features have been shown to be effective at improving the accuracy of Mandarin automatic speech recognition (ASR) systems, as F0 contains the most prominent tonal information for distinguishing words that are phonemically identical. Both long-ter...

Context in Multi-lingual Tone and P

Tone and intonation play a crucial role across many languages. However, the use and structure of tone varies widely, ranging from lexical tone which determines word identity to pitch accent signalling information status. In this paper, we employ a uniform representation of acoustic features for recognition of both Mandarin tone and English pitch accent. The representation captures both local to...

Tonal articulatory feature for Mandarin and its application to conversational LVCSR

This paper presents our recent work on the development of a tonal Articulatory Feature (AF) for Mandarin and its application to conversational LVCSR. Motivated by the theory of Mandarin phonology, eight features for classifying the acoustic units and one feature for classifying the tone are investigated and constructed in the paper, and the AF-based tandem approach is used to improve speech rec...

Acoustic cues to tonal contrasts in Mandarin: implications for cochlear implants.

The present study systematically manipulated three acoustic cues--fundamental frequency (f0), amplitude envelope, and duration--to investigate their contributions to tonal contrasts in Mandarin. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information...

Publication date: 2006